Abstract

We provide a condition under which a version of Shannon's Entropy
Power Inequality will hold for dependent variables. We provide
information inequalities extending those found in the independent
case.



Shannon's Entropy Power Inequality states that 

Theorem 1 For independent random variables $X, Y$ with densities, the
entropy of the sum satisfies:

$$2^{2H(X+Y)} \geq 2^{2H(X)} + 2^{2H(Y)},$$

with equality if and only if $X, Y$ are normal.
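
As a quick sanity check of the inequality $2^{2H(X+Y)} \geq 2^{2H(X)} + 2^{2H(Y)}$, the sketch below evaluates both sides in closed form for two cases: independent normals, where equality is expected, and two independent Uniform(0,1) variables, whose sum has the triangular density on [0,2] with differential entropy 1/2 nat. The function names are illustrative only, and entropies are taken in bits so that the base-2 exponentials match the statement.

```python
import math

def entropy_power_bits(h_bits):
    """Return 2^{2H} for a differential entropy H given in bits."""
    return 2.0 ** (2.0 * h_bits)

def normal_entropy_bits(variance):
    """Differential entropy of a normal with the given variance, in bits."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * variance)

# Case 1: independent normals with variances vx, vy (equality expected).
vx, vy = 2.0, 3.0
lhs_normal = entropy_power_bits(normal_entropy_bits(vx + vy))
rhs_normal = (entropy_power_bits(normal_entropy_bits(vx))
              + entropy_power_bits(normal_entropy_bits(vy)))

# Case 2: independent Uniform(0,1) variables.
# H(U) = 0 bits; the sum is triangular on [0, 2] with entropy 1/2 nat.
h_uniform_bits = 0.0
h_triangular_bits = 0.5 / math.log(2.0)  # convert nats to bits
lhs_uniform = entropy_power_bits(h_triangular_bits)
rhs_uniform = 2.0 * entropy_power_bits(h_uniform_bits)
```

In the uniform case the left side equals $e$ and the right side equals 2, so the inequality is strict, as expected for non-normal summands.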

Apart from its intrinsic interest, it provides a subadditive inequality for 
sums of random variables and is thus an important part of the entropy- 
theoretic proof of the Central Limit Theorem [|[] . Whilst Shannon's proof 
seems incomplete, in that he only checks that the necessary conditions for 
a local maximum are satisfied, a rigorous proof is provided by Stam (see 
Blachman [§]). This proof is based on a related inequality concerning Fisher 
information: 
Lemma 2 For independent $X, Y$ with differentiable densities:

$$\frac{1}{J(X+Y)} \geq \frac{1}{J(X)} + \frac{1}{J(Y)}.$$

The relationship between Theorem 1 and Lemma 2 comes via de Bruijn's
identity, which expresses entropy as an integral of Fisher informations.

Now, Takano provided conditions on the random variables $X, Y$ such that
Theorem 1 would still hold for weakly dependent variables. In contrast, we
change the inequality, replacing entropies by conditional entropies, and
provide alternative conditions for this related result. Our approach is again
to develop a Fisher information inequality, and to use an integral form of it
to deduce the full result.

We consider random variables $X, Y$ with joint density $p(x,y)$ and marginal
densities $p_X(x)$, $p_Y(y)$. We need to refer to score functions and Fisher
informations. Write $\rho_X(x) = p_X'(x)/p_X(x)$ and $\rho_Y(y) = p_Y'(y)/p_Y(y)$.
We write $p^{(1)}(x,y)$ for $\partial p(x,y)/\partial x$, and similarly for
$p^{(2)}(x,y)$, and $\rho^{(1)}(x,y) = p^{(1)}(x,y)/p(x,y)$, and similarly for
$\rho^{(2)}(x,y)$. Now, we can define $J(X) = E\rho_X(X)^2$ and
$J(Y) = E\rho_Y(Y)^2$ for the Fisher informations of $X$ and $Y$, and
$J_{XX} = E\rho^{(1)}(X,Y)^2$, $J_{YY} = E\rho^{(2)}(X,Y)^2$,
$J_{XY} = E\rho^{(1)}(X,Y)\rho^{(2)}(X,Y)$, similarly. We will need to consider
terms of the form:
$M_{a,b}(x,y) = a(\rho^{(1)}(x,y) - \rho_X(x)) + b(\rho^{(2)}(x,y) - \rho_Y(y))$.

Lemma 3 (Takano) As in the independent case, we can express the score
function $\rho_W$ of the sum $W = X + Y$ as a conditional expectation of the
scores of $(X, Y)$:

$$\rho_W(w) = E(\rho^{(1)}(X,Y) \mid X+Y = w) = E(\rho^{(2)}(X,Y) \mid X+Y = w).$$



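Proof Since $W = X + Y$ has density
$p_W(w) = \int p(x, w-x)dx = \int p(w-y, y)dy$, we know that:

$$\rho_W(w) = \frac{p_W'(w)}{p_W(w)} = \int \frac{p^{(1)}(w-y,y)}{p_W(w)} dy = \int \frac{p^{(1)}(w-y,y)}{p(w-y,y)} \cdot \frac{p(w-y,y)}{p_W(w)} dy,$$

hence the first identity follows; the second follows in the same way from
$\int p(x, w-x)dx$. □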

Using this, we establish the following proposition, the equivalent of Lemma 2
for dependent variables, which reduces to Lemma 2 in the independent
case:
Proposition 4 For random variables $X, Y$ with differentiable densities:

$$\frac{1}{J(X+Y) - J_{XY}} \geq \frac{1}{J_{XX} - J_{XY}} + \frac{1}{J_{YY} - J_{XY}}.$$

Equality holds when $X, Y$ are multivariate normal.

Proof Using the conditional representation, Lemma 3, for any $a, b$:

$$0 \leq E\big(a\rho^{(1)}(X,Y) + b\rho^{(2)}(X,Y) - (a+b)\rho_W(X+Y)\big)^2 = a^2J_{XX} + 2abJ_{XY} + b^2J_{YY} - (a+b)^2J(X+Y).$$

Now, motivated by the choice of $a, b$ that give equality in the Gaussian case,
we take $a = J_{YY} - J_{XY}$, $b = J_{XX} - J_{XY}$, and rearranging, we obtain that:

$$J(X+Y) \leq \frac{J_{XX}J_{YY} - J_{XY}^2}{J_{XX} + J_{YY} - 2J_{XY}},$$

and subtracting $J_{XY}$ from both sides we obtain the result. □
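
The equality case of the inequality $1/(J(X+Y) - J_{XY}) \geq 1/(J_{XX} - J_{XY}) + 1/(J_{YY} - J_{XY})$ can be checked numerically. The sketch below assumes a zero-mean bivariate normal with covariance matrix $\Sigma$, for which the score vector is $-\Sigma^{-1}(x,y)^T$ and so the Fisher matrix is $\Sigma^{-1}$; the function names and the particular variances are illustrative choices, not taken from the paper.

```python
import math

def gaussian_fisher_terms(var_x, var_y, cov):
    """Fisher quantities for a zero-mean bivariate normal.

    The Fisher matrix equals the inverse covariance matrix, and X + Y is
    normal with variance var_x + var_y + 2 * cov.
    """
    det = var_x * var_y - cov ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    j_xx = var_y / det
    j_yy = var_x / det
    j_xy = -cov / det
    j_sum = 1.0 / (var_x + var_y + 2.0 * cov)
    return j_xx, j_yy, j_xy, j_sum

def proposition4_sides(var_x, var_y, cov):
    """Return (left side, right side) of the Fisher information inequality."""
    j_xx, j_yy, j_xy, j_sum = gaussian_fisher_terms(var_x, var_y, cov)
    lhs = 1.0 / (j_sum - j_xy)
    rhs = 1.0 / (j_xx - j_xy) + 1.0 / (j_yy - j_xy)
    return lhs, rhs

lhs, rhs = proposition4_sides(2.0, 3.0, 0.5)
```

For these inputs the two sides agree up to rounding, consistent with equality for multivariate normals.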



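Lemma 5 If $(X_t, Y_t) = (X, Y) + Z_{Ct}$, where $Z_{Ct} \sim N(0, Ct)$, and
$W_t = X_t + Y_t$, then writing $a = J_{XX} - J_{XY}$, $b = J_{YY} - J_{XY}$:

$$\frac{\partial}{\partial t}\big(2H(X_t,Y_t) - 2H(W_t)\big) \geq \frac{C_{11}a^2 - 2C_{12}ab + C_{22}b^2}{a+b} \geq 0.$$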

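Proof Johnson and Suhov prove the multivariate de Bruijn identity:

$$\frac{\partial H}{\partial t}\big((X,Y) + Z_{Ct}\big) = \frac{1}{2}\,\mathrm{tr}(CJ),$$

where $J$ is the Fisher matrix $E\rho\rho^T$ of $(X,Y) + Z_{Ct}$, with $\rho$ the
score vector, equal to $\nabla f/f$. By Proposition 4 we deduce that:

$$\frac{\partial}{\partial t}\big(2H(X_t,Y_t) - 2H(W_t)\big) = C_{11}J_{XX} + 2C_{12}J_{XY} + C_{22}J_{YY} - (C_{11} + 2C_{12} + C_{22})J(W_t)$$

$$\geq C_{11}J_{XX} + 2C_{12}J_{XY} + C_{22}J_{YY} - (C_{11} + 2C_{12} + C_{22})\frac{J_{XX}J_{YY} - J_{XY}^2}{J_{XX} + J_{YY} - 2J_{XY}}.$$

Substituting $J_{XX} = a + J_{XY}$ and $J_{YY} = b + J_{XY}$, the right-hand side
simplifies to $(C_{11}a^2 - 2C_{12}ab + C_{22}b^2)/(a+b)$, which is non-negative
since $C$ is a covariance matrix. □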
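
The last expression in the proof can be rewritten in terms of $a = J_{XX} - J_{XY}$ and $b = J_{YY} - J_{XY}$ as $(C_{11}a^2 - 2C_{12}ab + C_{22}b^2)/(a+b)$. The sketch below is a numerical spot-check of that algebraic identity on randomly drawn values; the sampling ranges and function names are assumptions made for illustration, chosen so that $a, b > 0$ and $C$ is positive semidefinite.

```python
import math
import random

def lower_bound_direct(c11, c12, c22, jxx, jyy, jxy):
    """Final bound of the proof, evaluated exactly as written there."""
    ratio = (jxx * jyy - jxy ** 2) / (jxx + jyy - 2.0 * jxy)
    return (c11 * jxx + 2.0 * c12 * jxy + c22 * jyy
            - (c11 + 2.0 * c12 + c22) * ratio)

def lower_bound_simplified(c11, c12, c22, jxx, jyy, jxy):
    """Same quantity rewritten with a = Jxx - Jxy and b = Jyy - Jxy."""
    a = jxx - jxy
    b = jyy - jxy
    return (c11 * a ** 2 - 2.0 * c12 * a * b + c22 * b ** 2) / (a + b)

rng = random.Random(0)
max_gap = 0.0
min_value = float("inf")
for _ in range(1000):
    jxx, jyy = rng.uniform(1.0, 5.0), rng.uniform(1.0, 5.0)
    jxy = rng.uniform(-0.9, 0.9)  # keeps a > 0 and b > 0
    c11, c22 = rng.uniform(0.1, 3.0), rng.uniform(0.1, 3.0)
    # |c12| <= sqrt(c11 * c22) keeps C positive semidefinite.
    c12 = rng.uniform(-1.0, 1.0) * math.sqrt(c11 * c22)
    direct = lower_bound_direct(c11, c12, c22, jxx, jyy, jxy)
    simple = lower_bound_simplified(c11, c12, c22, jxx, jyy, jxy)
    max_gap = max(max_gap, abs(direct - simple))
    min_value = min(min_value, simple)
```

Here `max_gap` measures the largest disagreement between the two forms and `min_value` the smallest bound seen; the former should be at rounding level and the latter non-negative.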

Now, for functions $f(t), g(t)$, we can define $(X_t, Y_t) = (X, Y) + (Z_1, Z_2)$,
where $Z_1, Z_2$ are independent, with $Z_1 \sim N(0, f(t))$, $Z_2 \sim N(0, g(t))$.
We write $v_{X_t}$, $p_{X_t}$, $\rho_{X_t}$ for the variance, density and score
function of $X_t$. This perturbation ensures that densities are smooth and
allows us to use the 2-dimensional version of the de Bruijn identity.

Condition 1 For all $t$, $E\rho_{X_t}(X_t)\rho_{Y_t}(Y_t) \geq 0$.

Compare Condition 1 with Takano's condition, involving the same term:



Condition 2 For allt, Epx,{Xt)pYAYt) > ^Ml^_,, where X = yj^j^- 

Takano shows that Condition 2 implies that the original Entropy Power
Inequality, Theorem 1, holds. With our weaker condition, we provide a weaker,
though still interesting, result.

Theorem 6 (Conditional Entropy Power Inequality) If Condition 1
holds then:

$$2^{2H(X+Y)} \geq 2^{2H(X|Y)} + 2^{2H(Y|X)}.$$
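Proof Taking $f, g$ defined by $f' = 2^{2H(X_t|Y_t)}$, $g' = 2^{2H(Y_t|X_t)}$ and defining

$$s(t) = \frac{2^{2H(X_t|Y_t)} + 2^{2H(Y_t|X_t)}}{2^{2H(X_t+Y_t)}},$$

the de Bruijn identity gives, with $W_t = X_t + Y_t$,

$$s'(t) = 2^{-2H(W_t)}\Big(E\big(f'\rho^{(1)} + g'\rho^{(2)} - (f'+g')\rho_{W_t}\big)^2 + f'g'\,EM_{1,-1}^2 + 2f'g'\,E\rho_{X_t}\rho_{Y_t}\Big) \geq 0,$$

since $0 \leq EM_{1,-1}^2 = J_{XX} - 2J_{XY} + J_{YY} - J(X_t) - J(Y_t) - 2E\rho_{X_t}\rho_{Y_t}$,
and the last term is non-negative by Condition 1. Hence $s(t)$ is an
increasing function of $t$. Now as $t \to \infty$, $s(t) \to 1$, since
$(X_t, Y_t)$ tends to an independent pair of normals. Hence $s(0) \leq 1$ and
the result follows. □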
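
To see the role of the sign condition in the conditional inequality $2^{2H(X+Y)} \geq 2^{2H(X|Y)} + 2^{2H(Y|X)}$, the following sketch evaluates both sides for a zero-mean bivariate normal, where everything is available in closed form. For such a pair $E\rho_X(X)\rho_Y(Y) = \mathrm{Cov}(X,Y)/(v_Xv_Y)$, and adding independent normal noise leaves the covariance unchanged, so the condition reduces to non-negative correlation. The function names and parameter values are illustrative assumptions.

```python
import math

TWO_PI_E = 2.0 * math.pi * math.e

def conditional_epi_sides(var_x, var_y, corr):
    """Return (2^{2H(X+Y)}, 2^{2H(X|Y)} + 2^{2H(Y|X)}) for a bivariate normal.

    For a normal with variance v and entropy H in bits, 2^{2H} = 2*pi*e*v.
    """
    cov = corr * math.sqrt(var_x * var_y)
    lhs = TWO_PI_E * (var_x + var_y + 2.0 * cov)
    # Conditional variances: Var(X|Y) = var_x * (1 - corr^2), same for Y|X.
    rhs = TWO_PI_E * (var_x + var_y) * (1.0 - corr ** 2)
    return lhs, rhs

def score_product_expectation(var_x, var_y, corr):
    """E rho_X(X) rho_Y(Y) for zero-mean normals, where rho_X(x) = -x / var_x."""
    cov = corr * math.sqrt(var_x * var_y)
    return cov / (var_x * var_y)

lhs_pos, rhs_pos = conditional_epi_sides(1.0, 1.0, 0.5)
lhs_neg, rhs_neg = conditional_epi_sides(1.0, 1.0, -0.5)
```

With correlation 0.5 the sign condition holds and so does the inequality; with correlation -0.5 the sign condition fails and the inequality fails too, which suggests that some condition of this kind cannot simply be dropped.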
Cover and Zhang provide a bound on the entropy $H(X+Y)$, under the
condition that $X$ and $Y$ have the same marginal density $f$. They show that
$H(X+Y) \leq H(2X)$ if and only if $f$ is log-concave (that is, the score function
is decreasing). Notice that our Condition 1 holds if $X, Y$ are FKG variables
with log-concave densities.

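We write $\psi(X,Y) = \sup_{x,y}|p_{X,Y}(x,y)/p_X(x)p_Y(y) - 1|$, the so-called
$\psi$-mixing coefficient. Note that since

$$E\rho_{X_t}(X_t)\rho_{Y_t}(Y_t) - \frac{\mathrm{Cov}(X_t,Y_t)}{v_{X_t}v_{Y_t}} = \iint \left(\rho_{X_t}(x)\rho_{Y_t}(y) - \frac{xy}{v_{X_t}v_{Y_t}}\right)\big(p_{X_t,Y_t}(x,y) - p_{X_t}(x)p_{Y_t}(y)\big)\,dx\,dy$$

$$\geq -\psi(X_t,Y_t)\sqrt{J(X_t)J(Y_t) - (v_{X_t}v_{Y_t})^{-1}},$$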




Condition 1 will hold if for all $t$:

$$\mathrm{Cov}(X,Y) = \mathrm{Cov}(X_t,Y_t) \geq \sqrt{v_{X_t}v_{Y_t}}\,\psi(X_t,Y_t)\sqrt{v_{X_t}J(X_t)v_{Y_t}J(Y_t) - 1}.$$

Now, by Lemma 2, $J(X_t) \leq 1/(J(X)^{-1} + f(t))$, so if $X, Y$ have
the same marginals (and wlog variance 1), then
$v_{X_t}J(X_t)v_{Y_t}J(Y_t) - 1 \leq (J(X)^2 - 1)/(1 + f(t)J(X))$. Thus, we require that:

$$\mathrm{Cov}(X,Y)\,\frac{\sqrt{1 + f(t)J(X)}}{1 + f(t)} \geq \psi(X_t,Y_t)\sqrt{J(X)^2 - 1}.$$

From $f = 0$, we deduce that we need $J(X) \leq \sqrt{\mathrm{Cov}(X,Y)^2/\psi(X,Y)^2 + 1}$,
and if $\lim_{t\to\infty} f(t)\psi(X_t,Y_t) = 0$, we are through.
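
The algebraic step bounding $v_{X_t}J(X_t)v_{Y_t}J(Y_t) - 1$ by $(J(X)^2 - 1)/(1 + f(t)J(X))$ can be spot-checked on a grid. The sketch below assumes unit-variance marginals with the same noise variance $f$ added to both coordinates, so that $v_{X_t} = v_{Y_t} = 1 + f$, and it substitutes the upper bound $J(X)/(1 + fJ(X))$ for $J(X_t)$; it uses $J(X) \geq 1$, which holds for variance 1 by the Cramér-Rao bound. This is a numerical check under those assumptions, not a proof, and the function names are illustrative.

```python
def perturbed_product_minus_one(j, f):
    """(1 + f)^2 * (J / (1 + f J))^2 - 1: the largest value the left side can take."""
    j_t = j / (1.0 + f * j)
    return (1.0 + f) ** 2 * j_t ** 2 - 1.0

def claimed_upper_bound(j, f):
    """(J^2 - 1) / (1 + f J), the bound stated in the text."""
    return (j ** 2 - 1.0) / (1.0 + f * j)

grid_j = [1.0 + 0.25 * k for k in range(41)]   # J from 1 to 11
grid_f = [0.1 * k for k in range(101)]         # f from 0 to 10
smallest_slack = min(
    claimed_upper_bound(j, f) - perturbed_product_minus_one(j, f)
    for j in grid_j
    for f in grid_f
)
```

A direct calculation gives the slack as $fJ(J-1)^2/(1+fJ)^2$, so `smallest_slack` should be zero up to rounding, attained when $f = 0$ or $J = 1$.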

Although we know that $\psi(X_t,Y_t) \leq \psi(X,Y)$, we need some theory of
convexity of mixing coefficients to provide the most natural conditions.